Take Home Portion:
Consider the data in the file: FinalExamData.csv.
This file contains a column Xt and a column Zt. Your goal is to model this data using a vector autoregressive (VAR) model and a multi-layer perceptron (MLP) model. You ultimately want to forecast Xt with a horizon of 10, using Zt if it is useful.
plotts.fore.wge <- function(test_ts, fore_ts, p_title = NULL, p_xtitle = NULL, p_ytitle = NULL, plot = FALSE) {
  # Plots observed vs. forecast values with 95% limits and reports the ASE in the title.
  # fore_ts is expected to be a list with components f (forecasts), ll and ul (lower/upper limits).
  if (plot == TRUE) {
    len <- length(test_ts)
    x <- seq(1, len, by = 1)
    y <- test_ts
    ase <- mean((test_ts - fore_ts$f)^2)
    p_title <- paste0(p_title, " ASE - ", round(ase, 2))
    fig <- plot_ly()
    fig <- fig %>% add_lines(x = x, y = y, color = I("black"), name = "observed")
    fig <- fig %>% add_lines(x = x, y = fore_ts$f, color = I("blue"), name = "prediction")
    fig <- fig %>% add_ribbons(x = x, ymin = fore_ts$ll, ymax = fore_ts$ul, color = I("grey"), name = "95% confidence")
    fig <- fig %>% layout(title = p_title, xaxis = list(title = p_xtitle), yaxis = list(title = p_ytitle))
    fig
  }
}
Question a) :
Identify any relationship/association between Xt and Zt. Specifically, is there evidence of a relationship/association between Xt and a lagged Zt? If so, what is the lag? What evidence do you have to support this relationship/association?
Answer.
The cross-correlation output below shows that Xt and Zt are associated at lag 5: the cross-correlation at that lag (shown as lag -5 in the output) is 0.861. This is consistent with Zt, lagged 5 time points, being strongly related to Xt.
##
## Autocorrelations of series 'X', by lag
##
## -16 -15 -14 -13 -12 -11 -10 -9 -8 -7 -6
## 0.014 0.173 -0.223 0.078 -0.134 0.026 0.126 -0.056 -0.055 0.032 -0.270
## -5 -4 -3 -2 -1 0 1 2 3 4 5
## 0.861 -0.191 0.022 -0.118 -0.118 0.103 -0.003 -0.079 0.063 -0.170 0.156
## 6 7 8 9 10 11 12 13 14 15 16
## 0.098 -0.155 0.097 -0.005 0.128 0.023 0.014 -0.024 -0.013 -0.192 0.060
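The output above can be reproduced with R's `ccf()` function. A minimal sketch, assuming the data have been read into a data frame named `final_ts` with columns `Xt` and `Zt` (the object name and exact argument order used by the author are assumptions):

```r
# Cross-correlation between the two series: a single large spike (0.861)
# 5 lags from zero points to a lag-5 relationship between Xt and Zt.
final_ts <- read.csv("FinalExamData.csv")  # contains columns Xt and Zt
ccf(final_ts$Xt, final_ts$Zt, lag.max = 16)
```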
Question b.i
Provide the code you used to fit the models and make the predictions. (If you find the Zt’s useful, you will need to forecast them to use in making predictions… a very simple univariate model will do.)
Question b.ii.
Find the ASE for each model using the last 10 observations of the Xt dataset. Include your code for this as well.
Here is the code used to build the VAR model.
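The selection output and model summary below were likely produced by code along these lines (a sketch using the `vars` package; training on the first 85 observations is an assumption, consistent with the reported sample size of 78 for a lag-7 VAR and with the MLP code later in the document):

```r
library(vars)

final_ts <- read.csv("FinalExamData.csv")   # columns Xt and Zt
train <- final_ts[1:85, ]                   # hold out the last 10 observations (assumption)
X <- cbind(Xt = train$Xt, Zt = train$Zt)

# Pick the lag order by AIC/HQ/SC/FPE, then fit a VAR with that order.
VARselect(X, lag.max = 10, type = "both")   # all four criteria select p = 7
lsfit <- VAR(X, p = 7, type = "both")
summary(lsfit)
```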
## $selection
## AIC(n) HQ(n) SC(n) FPE(n)
## 7 7 7 7
##
## $criteria
## 1 2 3 4 5 6
## AIC(n) 5.073068 5.136325 5.150271 5.004156 3.072408 2.922366
## HQ(n) 5.171771 5.284380 5.347678 5.250915 3.368520 3.267829
## SC(n) 5.320266 5.507123 5.644668 5.622153 3.814005 3.787562
## FPE(n) 159.695722 170.206101 172.759399 149.508116 21.713955 18.750579
## 7 8 9 10
## AIC(n) 2.462031 2.499070 2.573375 2.562035
## HQ(n) 2.856846 2.943237 3.066894 3.104906
## SC(n) 3.450826 3.611464 3.809369 3.921628
## FPE(n) 11.885771 12.405831 13.460990 13.430655
##
## VAR Estimation Results:
## =========================
## Endogenous variables: Xt, Zt
## Deterministic variables: both
## Sample size: 78
## Log Likelihood: -284.373
## Roots of the characteristic polynomial:
## 0.9497 0.9497 0.9031 0.9031 0.9022 0.9022 0.8767 0.8767 0.8739 0.8739 0.803 0.803 0.7598 0.7598
## Call:
## VAR(y = X, p = 7, type = "both")
##
##
## Estimation results for equation Xt:
## ===================================
## Xt = Xt.l1 + Zt.l1 + Xt.l2 + Zt.l2 + Xt.l3 + Zt.l3 + Xt.l4 + Zt.l4 + Xt.l5 + Zt.l5 + Xt.l6 + Zt.l6 + Xt.l7 + Zt.l7 + const + trend
##
## Estimate Std. Error t value Pr(>|t|)
## Xt.l1 0.594922 0.099826 5.960 1.30e-07 ***
## Zt.l1 -0.045087 0.107417 -0.420 0.6761
## Xt.l2 -0.624099 0.096548 -6.464 1.80e-08 ***
## Zt.l2 -0.036826 0.112595 -0.327 0.7447
## Xt.l3 -0.014268 0.033738 -0.423 0.6738
## Zt.l3 0.050674 0.110755 0.458 0.6489
## Xt.l4 0.022832 0.033174 0.688 0.4939
## Zt.l4 0.202555 0.112824 1.795 0.0775 .
## Xt.l5 0.049600 0.031658 1.567 0.1223
## Zt.l5 3.035013 0.115861 26.195 < 2e-16 ***
## Xt.l6 0.057014 0.032482 1.755 0.0842 .
## Zt.l6 -1.681839 0.320709 -5.244 2.01e-06 ***
## Xt.l7 0.012380 0.032745 0.378 0.7067
## Zt.l7 2.035433 0.316711 6.427 2.09e-08 ***
## const -8.512893 6.029131 -1.412 0.1630
## trend -0.002335 0.009031 -0.259 0.7968
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 1.559 on 62 degrees of freedom
## Multiple R-Squared: 0.9539, Adjusted R-squared: 0.9428
## F-statistic: 85.57 on 15 and 62 DF, p-value: < 2.2e-16
##
##
## Estimation results for equation Zt:
## ===================================
## Zt = Xt.l1 + Zt.l1 + Xt.l2 + Zt.l2 + Xt.l3 + Zt.l3 + Xt.l4 + Zt.l4 + Xt.l5 + Zt.l5 + Xt.l6 + Zt.l6 + Xt.l7 + Zt.l7 + const + trend
##
## Estimate Std. Error t value Pr(>|t|)
## Xt.l1 0.002723 0.116698 0.023 0.98146
## Zt.l1 -0.328601 0.125573 -2.617 0.01114 *
## Xt.l2 0.116790 0.112867 1.035 0.30480
## Zt.l2 -0.116573 0.131626 -0.886 0.37923
## Xt.l3 -0.006358 0.039440 -0.161 0.87246
## Zt.l3 -0.198717 0.129475 -1.535 0.12992
## Xt.l4 -0.069321 0.038781 -1.787 0.07875 .
## Zt.l4 -0.129934 0.131894 -0.985 0.32838
## Xt.l5 0.016957 0.037008 0.458 0.64842
## Zt.l5 -0.069548 0.135444 -0.513 0.60944
## Xt.l6 0.022519 0.037972 0.593 0.55531
## Zt.l6 -0.131994 0.374916 -0.352 0.72599
## Xt.l7 -0.067592 0.038279 -1.766 0.08236 .
## Zt.l7 -0.545246 0.370243 -1.473 0.14590
## const 24.019154 7.048193 3.408 0.00116 **
## trend 0.008416 0.010557 0.797 0.42842
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Residual standard error: 1.822 on 62 degrees of freedom
## Multiple R-Squared: 0.2855, Adjusted R-squared: 0.1127
## F-statistic: 1.652 on 15 and 62 DF, p-value: 0.08578
##
##
##
## Covariance matrix of residuals:
## Xt Zt
## Xt 2.4296 -0.3202
## Zt -0.3202 3.3203
##
## Correlation matrix of residuals:
## Xt Zt
## Xt 1.0000 -0.1127
## Zt -0.1127 1.0000
This is the VAR forecast with a fanchart. The fanchart shows forecasts for both variables, accounting for the correlation between them; the forecasts for Xt track the series much better than those for Zt.
forecast <- list(f = preds$fcst$Xt[,1], ll = preds$fcst$Xt[,2], ul = preds$fcst$Xt[,3])
plotts.fore.wge(test$Xt, forecast, plot = TRUE)
This is the VAR ASE.
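The VAR ASE reported in the table further below was presumably computed along these lines (a sketch; `lsfit` is the fitted VAR and `test` the held-out last 10 observations, names carried over from the surrounding code):

```r
# Forecast 10 steps ahead from the VAR and score against the held-out data.
preds <- predict(lsfit, n.ahead = 10)
test <- final_ts[86:95, ]                        # last 10 observations (assumption)
ASE_var <- mean((test$Xt - preds$fcst$Xt[, 1])^2)
ASE_var
```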
Here is the code to build the MLP model.
mlpDF = ts(final_ts[1:85, 2:3])
# Fit an MLP on Xt with Zt as an external regressor, averaging 100 repetitions.
fit.mlp = mlp(mlpDF[,1], reps = 100, comb = "mean", xreg = as.data.frame(mlpDF[,2]), lags = 1:10)
fit.mlp
## MLP fit with 5 hidden nodes and 100 repetitions.
## Univariate lags: (1,2,3,6)
## 1 regressor included.
## - Regressor 1 lags: (4,5,6,7,8,10)
## Forecast combined using the mean operator.
## MSE: 0.3723.
This is the MLP forecast.
mlpTestDF = as.data.frame(ts(final_ts[,3]))
fore.mlp = forecast(fit.mlp, h = 10, xreg = mlpTestDF)
plot(fore.mlp)
The ensemble is the average of the forecasts from the VAR and MLP models.
| Model | ASE |
|---|---|
| VAR | 6.81 |
| MLP | 1.15 |
| Ensemble | 2.95 |
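The MLP and ensemble ASE values in the table above can be computed with a sketch like the following (assumes the `fore.mlp`, `preds`, and `test` objects from the earlier code chunks):

```r
# ASE of the MLP forecasts over the held-out last 10 observations.
ASE_mlp <- mean((test$Xt - fore.mlp$mean)^2)

# The ensemble is the simple average of the VAR and MLP point forecasts.
ensemble <- (preds$fcst$Xt[, 1] + fore.mlp$mean) / 2
ASE_ens <- mean((test$Xt - ensemble)^2)
```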
Question d) :
Make a quick statement about which of the 3 models you feel is most useful and why.
Answer :
The VAR ASE is 6.81 and the MLP ASE is 1.15. The MLP looks best, with the lowest ASE, but I would go with the ensemble because it averages the two. The two methods take very different approaches to modeling, so averaging their predictions is likely to reduce the generalization error relative to relying on either model alone.
Question e) :
Using the model you thought was most useful, provide a plot of the 10 forecasts (time points 96 – 105) as well a list of the 10 forecasts. You do not need prediction intervals for this question.
Since the ensemble has been selected, we need to average the next 10 predictions from both models, as follows.
First, get the forecasts for time points 96-105 using the VAR model. We get forecasts for both \(X_{t}\) (first table below) and \(Z_{t}\) (second table).
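Consistent with the use of `var_prediction$fcst$Zt[11:20,]` in the MLP code below, these tables were likely produced by a 20-step forecast from the VAR fit on the first 85 observations (a sketch; the exact call is an assumption):

```r
# 20 steps ahead from the VAR fit on points 1-85:
# steps 1-10 cover time 86-95, steps 11-20 cover time 96-105.
var_prediction <- predict(lsfit, n.ahead = 20)
var_prediction$fcst$Xt[11:20, ]   # Xt forecasts for 96-105
var_prediction$fcst$Zt[11:20, ]   # Zt forecasts for 96-105
```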
## fcst lower upper CI
## [1,] 30.41449 17.60230 43.22669 12.81219
## [2,] 28.88744 16.02330 41.75159 12.86414
## [3,] 31.01826 18.05716 43.97936 12.96110
## [4,] 29.21632 16.25450 42.17815 12.96182
## [5,] 29.98873 16.80968 43.16779 13.17905
## [6,] 30.68726 17.21733 44.15719 13.46993
## [7,] 28.65271 15.18029 42.12513 13.47242
## [8,] 30.16329 16.57829 43.74829 13.58500
## [9,] 30.29818 16.66751 43.92885 13.63067
## [10,] 30.15519 16.48236 43.82802 13.67283
## fcst lower upper CI
## [1,] 10.342489 6.314470 14.37051 4.028019
## [2,] 9.649623 5.621514 13.67773 4.028109
## [3,] 10.100107 6.028596 14.17162 4.071511
## [4,] 10.143779 6.046612 14.24095 4.097167
## [5,] 10.111416 6.008520 14.21431 4.102896
## [6,] 9.953890 5.847267 14.06051 4.106623
## [7,] 10.104139 5.996575 14.21170 4.107565
## [8,] 10.029459 5.916788 14.14213 4.112672
## [9,] 10.224849 6.105721 14.34398 4.119128
## [10,] 9.889032 5.768013 14.01005 4.121019
Next, get the forecasts for 96-105 using the MLP model, using the VAR forecasts of \(Z_{t}\) as the future values of the regressor.
# Extend the observed Zt series with the VAR forecasts of Zt for time 96-105.
zt_new_df <- rbind(data.frame(x = final_ts[,3]), data.frame(x = var_prediction$fcst$Zt[11:20,][,1]))
fore20.mlp = forecast(fit.mlp, h = 20, xreg = zt_new_df)
plot(fore20.mlp)
## t+11 t+12 t+13 t+14 t+15 t+16 t+17 t+18
## 24.05010 24.14142 17.00253 31.03424 38.01539 30.82702 28.56353 28.27106
## t+19 t+20
## 28.53784 31.88707
Forecast of 96-105 using the ensemble:
ensemble20 = (fore20.mlp$mean[1:20] + var_prediction$fcst$Xt[1:20, 1]) / 2
len <- length(final_ts$Xt)
x <- seq(1, len, by = 1)
y <- final_ts$Xt
fig <- plot_ly()
fig <- fig %>% add_lines(x = x, y = y, color = I("black"), name = "observed")
fig <- fig %>% add_lines(x = seq(86, 105, 1), y = ensemble20, color = I("blue"), name = "prediction")
fig <- fig %>% layout(title = "Ensemble Model", xaxis = list(title = "Time"), yaxis = list(title = "Xt Series"))
fig
Code for the Rolling Window ASE:
| Model | ASE | Rolling ASE |
|---|---|---|
| VAR | 6.81 | 48.22527 |
| MLP | 1.15 | 52.41321 |
| Ensemble | 2.95 | 49.27432 |
trainSize = 20
horizon = 10
ASEHolder1 <- numeric()
ASEHolder2 <- numeric()
ASEHolder3 <- numeric()
X <- cbind(Xt = final_ts$Xt, Zt = final_ts$Zt)
for (i in 1:(95 - (trainSize + horizon) + 1)) {
  # VAR forecasts for the current window
  preds = predict(lsfit, n.ahead = 10, newdata = final_ts$Xt[i:(trainSize + i - 1)])
  varForecasts <- preds$fcst$Xt[, 1:3]
  ASE1 <- mean((X[(trainSize + i):(trainSize + i + horizon - 1), 1] - varForecasts[, 1])^2)
  ASEHolder1[i] <- ASE1
  # MLP forecasts for the current window
  fore.mlp <- forecast::forecast(fit.mlp, newdata = final_ts$Xt[i:(trainSize + i - 1)], h = 10, xreg = mlpTestDF)
  ASE2 = mean((final_ts$Xt[(i + trainSize):(trainSize + i + horizon - 1)] - fore.mlp$mean)^2)
  ASEHolder2[i] <- ASE2
  # Ensemble: average of the VAR and MLP forecasts
  ensemble = (fore.mlp$mean + preds$fcst$Xt[, 1]) / 2
  ASE3 = mean((final_ts$Xt[(i + trainSize):(trainSize + i + horizon - 1)] - ensemble)^2)
  ASEHolder3[i] <- ASE3
}
## [1] 48.22527
## [1] 52.21166
## [1] 49.24604
x <- c(1:66)
data <- data.frame(x, ASEHolder1)
fig <- plot_ly(data, x = ~x, y = ~ASEHolder1, type = 'scatter', mode = 'lines', name = 'VAR Model')
fig <- fig %>% add_trace(y = ~ASEHolder2, name = 'Neural Network', mode = 'lines+markers')
fig <- fig %>% add_trace(y = ~ASEHolder3, name = 'Ensemble', mode = 'lines+markers')
fig <- fig %>% layout(title = "Rolling ASE comparison",
xaxis = list(title = "Iteration (Window)"),
yaxis = list (title = "Average Squared Error"))
fig
The rolling-window ASE suggests that the VAR and ensemble models are close, while the neural network has a higher rolling ASE. I would still go with the ensemble.